ABSTRACT


The  Cambridge  Project  is  a  cooperative  effort  by  a  number 
of  scientists  at  M.I.T.  and  Harvard;  its  purpose  is  to  make  the 
digital  computer  more  useful  and  usable  by  scientists  in  the 
basic  and  applied  behavioral  sciences,  and  in  other  sciences 
that have similar computing problems. This Technical Report
describes progress during the half year beginning in July 1973.
The Project is supported by the Advanced Research Projects
Agency  under  Contract  F30602-72-C-0001. 


The most notable single achievement of the half year covered
here was the transfer of the entire Consistent System from the
old Multics computer, which was a Honeywell 645, to a new Multics
computer, a Honeywell 6180, and the subsequent transfer to another
6180 operated by the Air Force Data Services Center. In spite of
the inevitable differences in operating systems that were
encountered, both of these transfers went smoothly. This was in
large part due to the planning that went into the design of the
Consistent System. This design was intended to isolate the
collection of programs from the details of the time-sharing
system in which they run.


1.0 SUMMARY


The  Cambridge  Project  is  a  cooperative  effort  by  a  number  of 
scientists  at  M.I.T.  and  Harvard;  its  purpose  is  to  make  the 
digital  computer  more  useful  and  usable  by  scientists  in  the 
basic  and  applied  behavioral  sciences,  and  in  other  sciences  that 
have  similar  computing  problems.  This  Technical  Report  describes 
progress during the half year beginning in July 1973. The
Project is supported by the Advanced Research Projects Agency
under Contract F30602-72-C-0001.

The Project has two main goals: first, developing programs
and  other  computing  tools;  second,  combining  these  tools,  and 
others  borrowed  from  other  sources,  into  a  Consistent  System  of 
programs,  models,  and  data  that  behavioral  scientists  can  use 
on-line.  Most  of  the  effort  is  devoted  to  writing  and 
documenting  programs,  but  attention  has  been  paid  to  theoretical 
studies  of  statistical  techniques,  equipping  a  computer-based 
laboratory  for  studying  autonomic  conditioning,  and  other  related 
subjects. 

Section  2  of  this  report  describes  work  toward  the  second  of 
those  goals,  the  Consistent  System,  which  has  increasingly  become 
the  focus  of  the  Project's  attention  as  time  has  passed.  The 
remaining  sections  report  on  work  toward  the  other  goal,  the 
development  of  computing  tools:  Section  3  is  on  tools  for 
data-handling; Section 4, for data-analysis; Section 5, for
modeling; and Section 6, for computer-controlled experiments and
studies  of  human  factors  in  on-line  computation. 

The  most  notable  single  achievement  of  the  half  year  covered 
here  was  the  transfer  of  the  entire  Consistent  System  from  the 
old  Multics  computer  at  M.I.T.,  which  was  a  Honeywell  645,  to  a 
new Multics computer at M.I.T., a Honeywell 6180, and the
subsequent  transfer  to  another  6180.  In  spite  of  the  inevitable 
differences  in  operating  systems  that  were  encountered,  both  of 
these  transfers  went  smoothly.  This  was  in  large  part  due  to  the 
design  of  the  Consistent  System,  a  design  that  was  intended  to 
isolate  the  collection  of  programs  from  the  details  of  the 
time-sharing  system  in  which  they  run. 

Work  will  continue  during  the  next  half  year  under  the  same 
contract. 


2.0  THE  CONSISTENT  SYSTEM 


The  project  has  continued  to  focus  more  and  more  on  the 
Consistent  System.  In  fact,  most  of  the  work  described  in  the 
other  sections  of  this  report  is  on  programs  that  are  already 
components of the System, or are intended to become components of it.

The  System  is  a  collection  of  programs  that  are  intended  for 
interactive  use,  and  that  are,  moreover,  designed  so  that  a 
scientist  who  is  not  a  programmer  can  use  them  together.  It  runs 
within  the  Multics  time-sharing  system,  and  is  expected  to  become 
a  repository  not  only  of  programs,  but  of  models  and  sets  of 
data,  and  it  appears  potentially  applicable  to  uses  far  wider 
than  the  Behavioral  and  Social  Sciences,  for  whose  needs  it  was 
originally  designed. 

The  collection  now  consists  of  about  200  programs  -  some 
quite  small  and  some  that  are  whole  systems  in  themselves  -  which 
add  up  to  almost  a  million  words  of  object  code.  Early  in  the 
period  covered  by  this  report  the  collection  was  successfully 
transferred  from  the  old  Multics  computer  at  M.I.T.,  a  Honeywell 
645,  to  the  new  one,  a  Honeywell  6180.  Shortly  thereafter  the 
entire  collection  was  also  transferred  to  the  6180  operated  by 
the  Air  Force  Data  Services  Center. 

The  report  on  the  period  from  July  1972  to  January  1973  gave 
-  on  pages  2-1  to  2-22  -  an  extensive  account  of  the  System, 
including  its  general  organization,  its  background  and  goals,  and 
the  kind  of  consistency  at  which  it  aims. 


2.1  The  Present  Collection 

Of  the  200  programs  now  in  the  collection,  about  160  are  of 
interest  to  the  user  who  is  making  serious  use  of  the  System 
today;  the  others  are  waiting  for  companion  programs  that  will, 
for  example,  make  it  easy  for  him  to  prepare  inputs  for  them  or 
display  their  outputs.  These  160  programs  divide,  from  his  point 
of  view,  into  roughly  four  groups: 

(1)  The  prototype  of  Janus.  This  is  a  large  system  that  handles 
data  of  a  kind  common  in  fields  like  behavioral  science.  A 
typical  set  of  data  contains  information  about  a  number  of 
entities (e.g., people), each of which has a number of
attributes (e.g., age, sex, years of schooling, occupation,
town of residence, and so on). Janus has some unusual powers,
many of which come from its ability to handle relations between
such sets of data. For example, if there is a set in which the
entities  are  people,  and  another  in  which  the  entities  are  towns, 
a  relation  between  the  two  sets  can  be  used  in  computations  that 
depend  upon  the  attributes  of  the  town  in  which  each  person 
lives. 

(2)  TSP-CSP.  This  is  a  large  system  originally  designed  for 
batch  processing.  Its  full  name  is  "Time  Series  Processor  - 
Cross  Section  Processor".  While  originally  intended  for 
econometricians,  it  has  a  great  deal  of  statistical  power  that  is 
useful  to  investigators  in  other  fields.  It  includes: 

ordinary  least  squares  (i.e.,  multiple  linear  regression) 

weighted  least  squares 

least  squares  with  instrumental  variables 

residual  analysis 

extrapolation  or  forecast  analysis 

nonlinear  least  squares 

Bayesian  regression 

regression  with  autocorrelated  errors 

spectral analysis

principal  components 

factor  analyzer  (varimax  method) 

correlation  analysis 

algebraic  operations  on  vectors  and  scalars 
matrix  arithmetic 

scatterplots  and  plots  of  time-series. 

(3)  Small  programs  that  work  on  numerical  arrays.  There  is  a 
collection  of  about  90  programs  that  accept  or  produce 
multidimensional, rectangular arrays of numbers. This part of
the  collection  offers  the  investigator  who  is  not  a  programmer 
great  flexibility  in  numerical  calculations;  by  using  sequences 
of  these  programs  he  can  perform  computations  for  which  he  does 
not  find  convenient  provisions  elsewhere  in  the  collection. 
About  half  of  these  programs  are  statistical.  Others  do  matrix 
arithmetic,  extract  and  replace  subarrays,  plot  graphs  on  a  CRT 
or  typewriter  terminal,  and  so  on. 

(4)  Miscellaneous.  This  category  includes:  (a)  programs  that 
permit  the  nonprogrammer  to  define  a  "macro-command"  -  i.e.,  an 
agent  that  will  run  off  a  sequence  of  programs  for  him;  (b) 
doorways  to  other  systems  that  run  within  Multics  but  have  not 
been  maintained  by  the  Cambridge  Project  -  e.g.,  doorways  to  APL 
and  to  text  editors;  (c)  service  programs  that  permit  the  user  to 
list  the  files  he  has  in  storage,  delete  files  he  no  longer 
wants,  gain  access  to  another  user's  files  (if  the  other  user  has 
told  Multics  to  permit  him  to),  leave  the  Consistent  System,  and 
so on.

From the point of view of the nonprogrammer, one of the most
important aspects of this collection of programs is their
compatibility. Janus can accept arrays of numbers from TSP, TSP
can accept them from the array programs, which can accept them
from  Janus,  and  so  on.  Thus  the  investigator  can  use  the 
programs  together  in  a  way  that  lets  their  capabilities  reinforce 
each  other. 
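
The relation between two sets of data described under item (1)
above can be illustrated with a short sketch in Python. This is
not Janus syntax. The two sets of data, their attribute values,
and the helper names "town_attribute" and
"mean_age_in_large_towns" are all hypothetical, chosen only to
show a computation on people that depends on attributes of the
towns in which they live.

```python
# Hypothetical illustration of a relation between two sets of data.
# None of these names or values come from Janus itself.

# One set of data: the entities are people, each with attributes.
people = {
    "p1": {"age": 34, "town": "Cambridge"},
    "p2": {"age": 51, "town": "Somerville"},
    "p3": {"age": 27, "town": "Cambridge"},
}

# Another set: the entities are towns (population figures are invented).
towns = {
    "Cambridge": {"population": 100_000},
    "Somerville": {"population": 88_000},
}


def town_attribute(person_id, attribute):
    """Follow the person -> town relation and return a town attribute."""
    town_name = people[person_id]["town"]
    return towns[town_name][attribute]


def mean_age_in_large_towns(threshold):
    """Mean age of people whose town exceeds a population threshold."""
    ages = [
        attrs["age"]
        for person_id, attrs in people.items()
        if town_attribute(person_id, "population") > threshold
    ]
    return sum(ages) / len(ages) if ages else None
```

The point of the sketch is that the second function never stores
town data with each person; it reaches the town attributes only
through the relation.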

The  project  is  at  present  devoting  most  of  its  effort  to 
expanding  and  improving  this  collection.  At  the  risk  of 
repeating  information  given  in  other  sections  of  this  report,  the 
main  areas  into  which  the  effort  is  going  are:  a  new  Janus  that 
will replace the present prototype; a new language, tentatively
named "DATATRAN", through which the user may communicate with the
new  Janus,  the  routines  now  in  TSP,  and  other  statistical 
programs  now  in  preparation;  a  collection  of  programs  for 
studying  passages  of  natural  text;  an  improved  set  of  programs 
for  graphics  on  CRT  terminals;  and  programs  for  modeling  of 
several  kinds. 


2.2  Documentation 

Principals:  Caroline  Lange,  Caroline  Thompkins,  A.  K.  Gaudreau 

At  least  one  of  the  members  of  the  Project's  faculty 
Advisory  Committee  is  of  the  opinion  that  the  Consistent  System 
is  the  best-documented  large  body  of  software  he  has  seen.  In 
any  case,  considering  that  this  is  a  system  still  under 
development,  the  thoroughness  of  the  documentation  can  justly  be 
regarded  as  unusual.  While  sheer  volume  is  no  guarantee  of 
quality,  it  tells  at  least  part  of  the  story:  there  are  now 
about  1850  pages  of  documentation,  1000  for  nonprogrammers,  and 
850  for  those  programmers  who  want  to  add  new  programs  to  the 
collection.  Those  figures  do  not  include  the  additional, 
archival  documentation  that  is  kept  on  file  for  later  use  by 
programmers  who  need  to  maintain  or  modify  existing  programs  or 
the  Substrate  on  which  the  collection  rests. 

An  important  addition  to  documentation  for  users  has 
recently  been  made:  a  draft  of  an  introductory  manual  for 
nonprogrammers  who  want  to  use  the  Consistent  System  has  been 
prepared  by  Caroline  Lange,  and  it  is  being  circulated  on  a  trial 
basis. 

The cornerstone of the remaining documentation for users is
the Handbook of Programs and Data, which contains a description
of every program in the public libraries - both the "a" library,
which contains programs that have been formally accepted into the
collection, and the "x" library, which contains programs that are
still in an experimental status. Normally, this Handbook
attempts  to  describe  the  program  thoroughly  from  the  point  of 
view  of  a  nonprogrammer,  telling  him  everything  about  it  that  he 
might  ever  want  to  know  and  perhaps  a  little  more  besides.  The 
main  exceptions  are  two  programs  that  are  systems  in  themselves, 
TSP  and  the  prototype  Janus:  each  of  them  has  its  own  user's 
manual.

An  important  addition  has  also  been  made  to  the 
documentation  for  programmers  who  want  to  contribute  programs  to 
the  collection:  a  draft  of  a  manual  for  contributors  was 
prepared  by  John  Klensin  during  the  fall  and  is  being  circulated 
on  an  experimental  basis.  That  manual  serves  as  an  introduction 
to  the  fundamental  document  for  programmers,  namely  the  Handbook 
for Programmers, which describes from the contributor's point of
view  the  Substrate  on  which  the  collection  rests;  states  the 
conventions  that  contributions  to  the  collection  must  obey,  and 
the  additional  rules  they  should  obey  to  be  considered  "regular"; 
describes  the  file  Description  Schemes  that  have  been  officially 
accepted;  describes  the  "public  subroutines"  that  a  program  may 
invoke;  and  gives  the  standards  for  documentation  of  public 
programs  and  subroutines.  This  reference  book  has  grown 
considerably  during  the  past  half  year,  mainly  through  the 
introduction  of  the  descriptions  of  about  40  new  utility 
subroutines. 

To summarize activities in documentation: the Handbooks
are updated periodically; papers have been written and presented;
and several tutorial documents have been published. The
Substrate Logic Manual, mentioned in the last report, has been
published as: "The Multics Substrate Implementation", Klensin,
John C., The Cambridge Project, M.I.T., Dec. 1973. Also
published as a preliminary version was "The Beginner's Manual for
the Consistent System", Lange, Caroline S., The Cambridge
Project, M.I.T., November 1973. This primer contains an
introduction to the Consistent System for the new, nonprogramming
user: it defines the elementary concepts; explains how to enter
and leave the CS; discusses the user/terminal environment;
explains how to enter data; describes the service commands; and
shows how to create and run 'macros'. A more technical manual for
programmers was also published, written by John C. Klensin, and
entitled "Programming The Consistent System".


The  following  papers  and  reports  have  been  published  and/or 
presented  during  this  report  period: 

1. "Janus: A Data Management and Analysis System for the
Behavioral Sciences", Jeffrey Stamen and Robert Wallace,
which was released after its presentation at the August ACM
meeting in Atlanta, Georgia.

2. "ANSI TAPE Support on Multics", Godsell, S., The Cambridge
Project, M.I.T., 4 Sept. 1973.

3. "An On-Line Management System for the Secondary Analysis of
Public Opinion Data", Ross, J. Michael and Mela, Wendy,
paper presented at the Ninth Annual EDUCOM Conference,
Princeton, N.J., 10 October 1973.


2.3  Utility  Subroutines  and  Support  Programs 
Principal: John C. Klensin

During the reporting period, work has been completed on the
utility subroutines discussed in the last Technical Report.
These  routines  provide  convenient  ways  to  access  and  create  files 
and  to  analyze  input  directions  for  particular  programs.  In 
addition  to  those  designed  during  the  last  reporting  period  for 
which  coding  has  been  completed  during  this  one,  a  complete  set 
of  utilities  for  the  labeled  array  description  scheme  "genarray" 
has  been  defined,  coded,  and  documented.  This  description  scheme 
will  be  the  basis  for  most  of  the  numerical  and  statistical 
programming  to  be  done  during  the  balance  of  the  project.  The 
programs  designed  to  deal  with  it,  in  addition  to  handling 
labeled  data,  are  also  somewhat  more  flexible  than  earlier 
routines  in  the  way  they  handle  arguments  and  return  values. 

In  addition,  a  set  of  utility  subroutines  has  been  written 
and  documented  to  provide  easier  handling  of  unformatted 
character  files  of  description  scheme  "char".  These  utilities 
permit  acquiring  these  files,  concatenating  them,  making  simple 
character  strings  from  them,  and  so  forth,  without  requiring  the 
programmer  to  have  a  detailed  knowledge  of  Consistent  System  file 
handling.  Similar  routines  for  numerical  files  are  included 
among  the  "genarray"  set. 

Besides  the  utilities  for  ordinary  programs,  a  collection  of 
utilities  has  been  specified,  and  largely  coded,  to  facilitate 
the construction of Consistent System agent programs. These
utilities include routines for running other programs, and for
[illegible] information about the errors for the user and
responding to his requests with respect to errors that have
already occurred.

With the completion of the error and agent [illegible], the
first portion of this [illegible] runner originally constructed
[illegible] years ago has been extensively revised. This program
provides a user the capability of constructing [illegible] a
series of individual commands without having to resort to an
ordinary programming language to do so. The runner permits
specifications of command names and arguments [illegible] as the
"exec_com" or runoff facilities [illegible] time-sharing systems.
In addition, it provides, through control statements,
facilities for conditional and unconditional iteration,
conditional branching, and similar programming language-like
facilities. It also has provisions for creating formal
parameters and local temporaries, [illegible] translation
facilities of the Consistent System, and provision for sending
[illegible]. The design and most of the coding of this routine
[illegible] progress [illegible] completed during the next few
months.

This simple macro runner provides a procedure-oriented
technique [illegible] series of [illegible] executed, one per
line, with whatever control statements are needed to control the
execution. A second macro-like facility [illegible] handling
nested groups of commands [illegible] a single command
execution. The basic utilities and preliminary [illegible]
that the facility itself, tentatively called the "fn" runner,
will be completed during the next half year.
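
A macro runner of the general kind described above can be
sketched in Python as follows. This is a toy under stated
assumptions, not the Project's runner: the control-statement
syntax ("&set", "&label", "&goto", "&if ... = ... &goto ..."),
the "&1"-style formal parameters, and the table of commands are
all invented for illustration. It covers only formal parameters,
local temporaries, and branching, and makes no attempt at
iteration or error handling.

```python
# A toy macro runner: one command per line, plus a few control
# statements. The syntax used here is invented for illustration only.

def run_macro(lines, args, commands):
    """Run a macro and return the list of results of its commands."""
    # Record where each "&label name" line sits so branches can find it.
    labels = {
        line.split()[1]: index
        for index, line in enumerate(lines)
        if line.startswith("&label")
    }
    temps = {}      # local temporaries, created by "&set name value"
    results = []
    pc = 0          # index of the next line to execute
    while pc < len(lines):
        # Substitute formal parameters (&1, &2, ...) and temporaries.
        words = []
        for word in lines[pc].split():
            if word.startswith("&") and word[1:].isdigit():
                word = args[int(word[1:]) - 1]
            elif word.startswith("&") and word[1:] in temps:
                word = temps[word[1:]]
            words.append(word)
        pc += 1
        if not words or words[0] == "&label":
            continue
        if words[0] == "&set":
            temps[words[1]] = words[2]
        elif words[0] == "&goto":            # unconditional branch
            pc = labels[words[1]]
        elif words[0] == "&if":              # &if a = b &goto label
            if words[1] == words[3]:
                pc = labels[words[5]]
        else:                                # an ordinary command line
            results.append(commands[words[0]](*words[1:]))
    return results
```

A macro for this sketch is simply a list of lines such as
"echo &1" or "&if &2 = skip &goto done", and "commands" maps each
command name to a Python function that receives the remaining
words of the line as arguments.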

2.4  Substrate  and  Preprocessor 
Principal:  Robert  Sorrentino 

The Substrate, on which the balance [illegible] depends,
continues to [illegible] found by users during the period. A
Substrate Logic Manual describing completely the internal
interactions of the [illegible] released early in the reporting
[illegible] who are interested in it.

The Consistent System was moved to the Multics follow-on
machine  during  July  with  little  difficulty,  due  to  the  stability 
of  the  Substrate  and  preparations  done  during  the  preceding  three 
or  four  months.  Later  the  System  was  moved,  with  even  less 
difficulty,  to  the  Multics  at  the  Air  Force  Data  Services  Center. 
The  fact  that  both  moves  went  smoothly  in  spite  of  inevitable 
differences  in  operating  systems,  and  the  fact  that  the  System 
has  continued  to  work  in  spite  of  inevitable  changes  in  the 
M.I.T.  Multics,  are  regarded  as  further  evidence  of  the  value  of 
a  Substrate  that  isolates  the  collection  of  programs  from  changes 
in  the  underlying  software. 
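
The isolation idea behind the Substrate can be illustrated with
a small sketch in Python. It assumes nothing about the actual
Substrate: the classes "HostA" and "HostB", the "Substrate" class
with its "save" and "load" methods, and the path prefix are all
hypothetical. The sketch shows only the general pattern in which
application programs call one intermediate layer, and that layer
alone contains code specific to the host system.

```python
# Hypothetical sketch of the isolation idea: application programs call
# only the substrate, and only the substrate knows the host system.

class HostA:
    """Stand-in for one operating system's storage interface."""
    def __init__(self):
        self.segments = {}

    def write_segment(self, path, data):
        self.segments[path] = data

    def read_segment(self, path):
        return self.segments[path]


class HostB:
    """Stand-in for a different system with different conventions."""
    def __init__(self):
        self.store = []

    def put(self, name, payload):
        self.store.append((name, payload))

    def get(self, name):
        # The most recently stored payload under this name wins.
        for stored_name, payload in reversed(self.store):
            if stored_name == name:
                return payload
        raise KeyError(name)


class Substrate:
    """The only layer that contains host-specific code."""
    def __init__(self, host):
        self.host = host

    def save(self, name, data):
        if isinstance(self.host, HostA):
            self.host.write_segment(">cs>" + name, data)
        else:
            self.host.put(name.upper(), data)

    def load(self, name):
        if isinstance(self.host, HostA):
            return self.host.read_segment(">cs>" + name)
        return self.host.get(name.upper())


def double_array(substrate, source, target):
    """An 'application program': unchanged whichever host is underneath."""
    substrate.save(target, [2 * x for x in substrate.load(source)])
```

Moving to a new host in this sketch means changing "Substrate"
only; "double_array" and any other application program stay as
they are.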

Other work on the Substrate during this period consisted
mostly of extensive timings of Consistent System behavior and the
beginning of redesign and recoding of critical portions of the
Substrate to take better advantage of hardware features of the
new machine. This process has been somewhat delayed by the fact
that  these  facilities  have  been  only  gradually  turned  on,  in 
spite  of  the  fact  that  the  user  transition  to  that  machine  was 
made  in  July.  The  process  of  recoding  and  testing  is  expected  to 
continue  into  the  spring,  after  which  a  new  and  final  edition  of 
the  Substrate  Logic  Manual  will  be  produced. 

Work continues on the preprocessor, but it has been further
delayed  by  the  press  of  work  on  the  Substrate,  the  new  machine, 
and  similar  problems.  Its  completion  is  expected  during  the 
first  part  of  the  next  half  year. 


2.5  Conventions  and  Acceptance  Procedures 

Principals: Susan Godsell, John C. Klensin,
Jeffrey Stamen, and Douwe B. Yntema

The  procedures  for  accepting  programs  and  utilities  into  the 
System  continue  to  work  smoothly  -  so  smoothly,  in  fact,  as  to  be 
little  noticed  by  much  of  the  Project.  The  procedure  continues 
to  be  much  as  described  in  the  previous  annual  report,  except 
that  with  departure  of  Jeffrey  Stamen  from  the  Project  Staff  in 
the  fall,  Susan  Godsell  has  become  a  member  of  the  acceptance 
committee. 

The  library  continues  to  grow  with  more  than  400  programs 
and subroutines now available in the system.


2.6 Contacts with Users
Principal:  Martin  Broekhuysen 


In  the  period  July  through  October  we  helped  with  the 
Multics registration of six new projects that came to Multics
with  the  primary  purpose  of  using  the  Consistent  System.  We  know 
that  other  persons  and  projects  began  in  this  period  to  use  the 
CS;  we  estimate  several  dozen.  The  six  included  five  who  work 
out of Harvard's William James Hall (behavioral-science center):
four  faculty  plus  the  user-consultant  group,  all  of  them  with 
considerable  experience  in  computerized  data  analysis;  and  one 
faculty  member  of  Harvard's  School  of  Architecture,  with  little 
previous  experience.  Each  of  these  was  given  as  much  of  the 
Project's  user  documentation  as  we  determined  they  could 
profitably  use,  and  more  or  less  time  in  initial  consultation  - 
ranging  from  an  hour  or  so  up  to  a  day  -  spent  on  some  particular 
problem such as retrieving the contents of a previously-existing
directory or entering data not initially on punched cards. After
an  initial  period  of  two  weeks  or  so  in  which  they  might  have 
several  questions  in  up  to  half  a  dozen  phone  calls,  none  of 
these  have  raised  significant  problems.  We  do  know  that  they  are 
continuing to use the system - they are logging in fairly
regularly. Also, the total number of users is considerably larger
than this group. We hear by telephone from persons who were using
the system before that period and still are; we have members of
the Central Staff in frequent contact with the users in
Washington, in the Executive Office of Manpower Affairs of the
Commonwealth of Massachusetts, and at the University of Illinois,
and with users enrolled in a fall term course in the Department of
Urban Studies at M.I.T. On Janus alone, there is currently an
average of about
five  logins  per  day.  Judging  from  the  number  of  persons  known  to 
use  the  CS  regularly,  we  can  say  that  there  are  many  who  are 
getting  along  on  their  current  knowledge  and  the  user 
documentation,  without  needing  face-to-face  consultation  or  other 
assistance. 

We know enough about many of them to identify some research
interests of the group. Three areas are represented by three or
more users: medical-data analysis; budget management and
resource-allocation problems (Air Force Data Service, the
Commonwealth, the urban-studies course); and sociological


3.0 DATA HANDLING


3.1  Janus 


Principals: Jeffrey P. Stamen, Robert M. Wallace

Contributors: Pamela Hill, Chat-Yu Lam, Gary Palter,
Dorothy F. Shuford, Marcia Siegel


The Janus system is described in the paper "Janus: A Data
Management and Analysis System for the Behavioral Sciences", which
was presented at the 1973 National ACM Conference and which is
also available as a technical report from the Cambridge Project.
Janus is also described extensively in previous reports submitted
by the Cambridge Project.

There are currently two versions of Janus: a prototype,
which has been available to users for over two years, and
Version [illegible], which is currently under development. Almost
all work over the last half year has been on the new version of
Janus. At the beginning of this time most of the design and many
of the basic routines were complete. Now there are over twice as
many routines completed, three of the basic data-editing commands
are [illegible], the substructure of the system is working and
reasonably [illegible], and another three major commands are
nearing completion (define_attribute, create_attribute, and
compute). The new version of Janus will be available soon for
some users, and will replace the prototype in the spring of 1974.

The prototype of Janus has changed very little recently, but
has had considerable use. On the M.I.T. Multics system more than
five sessions per day are logged; in addition, there is a copy of
[illegible] on a Multics system at the Pentagon for a variety of
[illegible] ranging from resource allocation to survey analysis.


3.2  SURVEIR  Data  Management  Project 
Principals:  Michael  Ross,  Wendy  Mela 

The Project's effort in the last six months has involved the
conversion of the raw data input (i.e., the multipunched survey
[illegible]) [illegible] form and adding the relevant information
to the Data-Descriptor files. This program reads the
Data-Descriptor file produced by the codebook conversion and
column-binary tape files (these files are sent to us from the
Roper Center's RCA machine and unfortunately lead to serious
technical problems on the IBM 370), and generates a Janus dataset
for Multics and a new Data-Descriptor File for the retrieval
programs (SURVEIR) on Multics. This updated data descriptor file
contains data-dependent information, such as the existence of
multipunched data and the frequency for each variable, and basic
statistics.

A  subset  of  these  surveys  has  been  processed  and  is  being 
tested on SURVEIR. Refinements are being made in the codebook
and response conversion programs in order to handle more complex
questionnaire  formats.  By  the  end  of  the  project,  we  plan  to 
demonstrate  a  larger  subset  (approximately  26  surveys)  to  a  group 
of  social  scientists  who  rely  heavily  upon  secondary  analysis  of 
public  opinion  data  in  order  to  assess  the  adequacy  of  the 
system,  and  the  prospects  for  expanding  the  database,  and 
offering a user service as a part of the Cambridge Project.


4.0 DATA ANALYSIS


4.1  Project  Diana 
Principal:  Rosemarie  Rogers 

Contributors: Elaine C. Franklin, Edward J. McCabe,
Kathleen M. O'Connell

Special  contributor:  John  C.  Klensin 

The major part of the work on Project Diana during the past
half year has been devoted to programming. The work of
programming the Thesaurus Manager is complete and the programs
are part of the Consistent System. The thesaurus that we have
designed and implemented is basically a hierarchical structure in
which concepts are related in a parent-child fashion. The
thesaurus provides for synonym recognition. Terms entered in the
thesaurus may be single words or word phrases. The two programs
for creating the on-line thesaurus are init_diana_tables and
build_thesaurus. Five programs have been written which permit
easy updating of the thesaurus: add_concept, add_synonym,
delete_concept, delete_synonym, and move_subtree. These
programs  will  be  most  useful  once  the  actual  document  encoding 
and  experimenting  has  begun,  since  they  will  allow  us  to  modify 
the  thesaurus  as  deemed  necessary  in  light  of  increased  knowledge 
and  data.  We  have  also  written  a  number  of  thesaurus  utilities 
for  easy  inspection  of  the  thesaurus  contents. 
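
The structure just described, a hierarchy of concepts with
synonym recognition, can be sketched in Python as follows. The
method names echo the program names given above, but the data
layout, the behavior of each method, and the "lookup" and
"ancestors" helpers are assumptions made for illustration and
are not taken from the Diana programs themselves.

```python
# Hypothetical sketch of a hierarchical thesaurus with synonyms.
# The data layout and behavior are assumptions, not Diana's own.

class Thesaurus:
    def __init__(self, root="ROOT"):
        self.parent = {root: None}   # concept -> parent concept
        self.synonyms = {}           # synonym term -> concept

    def add_concept(self, concept, parent):
        if parent not in self.parent:
            raise KeyError(f"unknown parent: {parent}")
        self.parent[concept] = parent

    def add_synonym(self, term, concept):
        if concept not in self.parent:
            raise KeyError(f"unknown concept: {concept}")
        self.synonyms[term] = concept

    def lookup(self, term):
        """Map a word or phrase to its concept, or None if absent."""
        if term in self.parent:
            return term
        return self.synonyms.get(term)

    def ancestors(self, concept):
        """Return the chain of parents from the concept up to the root."""
        chain = []
        node = self.parent[concept]
        while node is not None:
            chain.append(node)
            node = self.parent[node]
        return chain

    def move_subtree(self, concept, new_parent):
        """Re-attach a concept, and with it its descendants, elsewhere."""
        if concept == new_parent or concept in self.ancestors(new_parent):
            raise ValueError("cannot move a concept beneath itself")
        self.parent[concept] = new_parent
```

Because each concept stores only its parent, moving a subtree in
this sketch is a single assignment; the descendants follow
automatically.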

The  program  package  for  text  and  query  analysis  and  document 
retrieval  is  also  complete.  It  consists  of  newly  written 
programs  and  other  programs  which  are  part  of  our  system  for 
semiautomatic  thesaurus  generation  (the  "clustering"  package)  and 
which  are  now  also  included  in  the  actual  text  manipulation 
package.  A  program  for  creating  the  negative  dictionary 
(create_neg_dict)  was  written.  The  package  for  text  analysis  and 
retrieval  consists  of  programs  for  processing  text  against  the 
negative dictionary (run_neg_dict), thesaurus look-up programs,
the  suffix  removal  program,  programs  for  matching  document 
vectors  against  query  vectors  and  for  outputting  the  references 
to  pertinent  documents.  Once  we  have  had  a  chance  to  check  our 
algorithms  in  an  actual  experimental  environment,  these  last 
processing  programs  may  be  modified  to  incorporate  the  results  of 
our  experimentation. 
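
The chain of processing steps described above (negative
dictionary, suffix removal, thesaurus look-up, and matching of
document vectors against query vectors) can be sketched in
Python. Everything specific here is an assumption made for
illustration: the word lists, the crude suffix rules, the tiny
thesaurus, and the use of cosine similarity as the matching
measure. The text does not say which measure the Diana programs
use.

```python
# Hypothetical sketch of the processing chain:
# negative dictionary -> suffix removal -> thesaurus look-up -> matching.
# The word lists, suffix rules, and similarity measure are assumptions.
from collections import Counter
from math import sqrt

NEGATIVE_DICTIONARY = {"the", "of", "and", "in", "a", "on"}
SUFFIXES = ("ing", "es", "s")            # crude rules, tried in order
THESAURUS = {
    "vot": "election", "election": "election", "ballot": "election",
    "tax": "finance", "budget": "finance",
}


def strip_suffix(word):
    """Remove the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def concept_vector(text):
    """Turn text into a count of thesaurus concepts."""
    vector = Counter()
    for word in text.lower().split():
        if word in NEGATIVE_DICTIONARY:
            continue
        concept = THESAURUS.get(strip_suffix(word))
        if concept is not None:
            vector[concept] += 1
    return vector


def cosine(u, v):
    """Cosine similarity of two concept vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents):
    """Return names of matching documents, best match first."""
    q = concept_vector(query)
    scored = [
        (cosine(q, concept_vector(text)), name)
        for name, text in documents.items()
    ]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]
```

In this sketch a query and a document match when they share
thesaurus concepts, even if they share no words, which is the
purpose of routing both through the thesaurus before comparison.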

A  report  on  the  Diana  system  as  it  currently  exists  is  now 
being written by Rosemarie Rogers and Elaine C. Franklin. It
will describe in detail the programs just mentioned as well as
the  data  on  which  the  system  is  built  (the  negative  dictionary, 
the  thesaurus  for  American  documents  and  the  thesaurus  for  Soviet 
documents).  The  report  is  expected  to  appear  in  March; 
prepublication copies will be available in early February 1974.

In  addition  to  the  efforts  just  described,  we  have  continued 
to  integrate  into  the  Consistent  System  those  programs  which  were 
written  for  Project  Diana  before  all  the  guidelines  for 
programming  for  the  Consistent  System  had  been  delineated.  The 
Diana programs stem_text and merge (from the clustering package)
are  now  part  of  the  Consistent  System. 

We  have  also  continued  our  analysis  of  clustering 
experiments  performed  in  late  1972  and  in  1973,  and  are  currently 
designing  a  further  set  of  such  experiments.  When  these  are 
completed  and  analyzed,  a  report  on  the  entire  work  will  be 
written  by  Rosemarie  Rogers.  Data  obtained  in  the  earlier 
experiments  have  been  incorporated  into  the  thesaurus  for 
American documents, and a preliminary version of the report has
been  written. 


4.2  TSP  Project 
Principal:  John  Brode 
Contributor:  Paul  Werbos 

As  described  in  the  last  semiannual  report,  TSP  is  now  fully 
operational  on  Multics  within  the  Consistent  System.  Automatic 
access  from  TSP  to  Janus  or  Consistent  System  files  is  included 
so  that  the  user  can  use  data  that  is  stored  elsewhere.  In 
addition,  the  user  can  now  access  Multics  files  or  tapes  so  that 
data from other machines and sources can be readily brought into
TSP  and  then  stored  permanently  in  a  Janus  dataset  or  in 
Consistent  System  mnarray  files. 

There are more than 30 matrix commands in addition to this
list  of  the  available  statistical  functions. 

TSP Name     Purpose

arma         Autoregressive moving-average estimation (Box-Jenkins)
armawt       Same as arma but with correction for time trend
             (exponential)
[illegible]  Bayesian regression
[illegible]  Estimation of capital stock with depreciation
[illegible]  Theil statistics for the comparison of time series
             (with plot)
[illegible]  Defines constant terms for nonlinear estimation or
             models
[illegible]  Cochrane-Orcutt estimation for autoregressive
             error term
[illegible]  Correlation matrix
[illegible]  Covariance matrix
[illegible]  Mean of several attributes for each entity
[illegible]  Stores series of suffixes
[illegible]  Eigenvalues of a square matrix
[illegible]  Bayesian extrapolation
[illegible]  Factor analysis (varimax rotation)
[illegible]  Inverse fast Fourier transform
[illegible]  Fast Fourier transform
[illegible]  Extrapolation of regression results with Theil
             statistics and plot
[illegible]  Forms cp matrix (correlation or covariance matrix)
             for olscp
[illegible]  Stores Polish string representation of a formula
[illegible]  Produces an offline graph
[illegible]  Hildreth-Liu estimation for autoregressive error
             term
[illegible]  Instrumental variable estimation (two stage least
             squares)
[illegible]  Creates a regular time series using data from
             irregular intervals
lsq          Nonlinear least squares estimation
namecp       Stores names of attributes in a "cp" matrix
ols          Least Squares (one-pass estimation)
olscp        Least Squares from a "cp" matrix
olsq         Least Squares (two-pass estimate using
             orthonormalization - Gram-Schmidt)
olsqwt       Least Squares weighted (two-pass)
olswt        Least Squares weighted (one-pass)
param        Defines parameters for nonlinear estimation
             or models
parm         Posterior distribution of linear combinations of
             coefficients
pdl          Distributed lag estimation using polynomial
             interpolation
plot         On-line scatter graph
prcomp       Principal components
prob         Posterior probabilities of linear combinations
             of coefficients
random       Returns a random attribute distributed N(0,1)
resid        Bayesian residual analysis
signt        Nonparametric sign test
solve        Solves a system of linear equations
spectr       Spectral analysis
srank        Spearman rank correlation test
tscorc       Two-stage Cochrane-Orcutt estimation
tshilu       Two-stage Hildreth-Liu estimation
tsplot       Linear plot (on-line) of time series
usecp        Sets default name for "cp" matrix
user         Doorway to user-written subroutines
utest        Mann-Whitney U-test
varian       Analysis of variance
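
As an illustration of one entry in the list, the following Python sketch shows the textbook Cochrane-Orcutt iteration for a single regressor with a first-order autoregressive error term. It is not taken from the TSP source; the function names, the convergence tolerance, and the restriction to one regressor are assumptions made to keep the example short.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    """Alternate between estimating rho from the residuals and re-fitting
    on quasi-differenced data until rho stops changing."""
    a, b = ols(x, y)
    rho = 0.0
    for _ in range(max_iter):
        u = [yi - a - b * xi for xi, yi in zip(x, y)]
        denom = sum(v * v for v in u[:-1])
        if denom == 0.0:  # perfect fit: nothing left to correct
            break
        new_rho = sum(u[t] * u[t - 1] for t in range(1, len(u))) / denom
        # Quasi-difference: y*_t = y_t - rho*y_(t-1), x*_t = x_t - rho*x_(t-1)
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols(xs, ys)
        a = a_star / (1.0 - new_rho)  # recover the original intercept
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return a, b, rho
```

The function returns the intercept, the slope, and the estimated autocorrelation coefficient of the error term.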


4.3  DATATRAN 
Principal:  John  Brode 

Contributors:  Jeffrey  Stamen,  Robert  Wallace 

The  DATATRAN  language  has  been  conceived  as  a  means  of 
separating  the  user  from  the  problems  of  passing  from  one  program 
or  system  to  another.  Within  DATATRAN,  all  statistical  programs 
and  all  matrix  operations  will  be  considered  as  single  or 
multivalued  functions.  The  user  will  be  able  to  nest  these 
functions  as  well  as  include  them  in  an  arbitrary  algebraic 
expression.  The  user  will  not  have  to  be  concerned  with  whether 
all  of  the  functions  being  used  are  within  one  or  another 
package.  Any  transferring  of  control  or  data  will  be  done 
implicitly  without  user  attention. 

DATATRAN  deals  with  the  expression  found  to  the  right  of  the 
equal sign (or replacement operator in APL). There is no limit
to  the  length  and  complexity  of  expressions.  Subexpressions  can 
be  nested  within  expressions  by  using  parentheses.  For  example: 

stdev(log(5+score/mean(score)))
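
Read from the inside out, this expression divides each score by the mean score, adds 5, takes logarithms, and returns the standard deviation of the result. A rough Python equivalent is sketched here with made-up data. It assumes that log is the natural logarithm and that stdev is the sample standard deviation; the report specifies neither.

```python
import math
import statistics

score = [62.0, 75.0, 81.0, 90.0, 58.0]  # made-up attribute values

mean_score = statistics.mean(score)                      # mean(score)
transformed = [math.log(5 + s / mean_score) for s in score]
result = statistics.stdev(transformed)                   # sample form (n - 1)
```

In DATATRAN the intermediate values never need names; the Python version names them only to make each nesting level visible.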

When  calling  a  function,  the  user  can  specify  whether  no 
print,  some  print,  or  all  print  is  desired.  For  example: 

regress(residuals of regress(A on X Y Z with print all) on X Y Z
    with print some)

On  the  inner  call  to  regress,  all  results  are  printed  but  on 
the  outer  call  only  partial  results  appear  at  the  console.  The 
user  can  specify  leads  and  lags  by  putting  "t"  plus  or  minus  the 
number  of  periods  to  be  led  or  lagged  in  parentheses  after  the 
attribute name. For instance:

regress(A on X(t-1) Y(t-1) Z(t-1))

gets a regression explaining A by the values of X, Y, and Z for
the previous period.
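
The lead and lag notation amounts to shifting each series before the regression is run. The Python sketch below shows one way such a shift could work; the helper names lag and aligned_rows are hypothetical and are not part of DATATRAN.

```python
def lag(series, periods):
    """Return the series shifted so that element t holds series[t - periods].
    Positive periods lag, negative periods lead; missing values become None."""
    n = len(series)
    return [series[t - periods] if 0 <= t - periods < n else None for t in range(n)]


def aligned_rows(dependent, *regressors):
    """Keep only the periods in which every series has a value."""
    return [row for row in zip(dependent, *regressors) if None not in row]
```

For example, lag([1, 2, 3, 4], 1) gives [None, 1, 2, 3], and aligned_rows(A, lag(X, 1)) drops the first period, for which no previous value of X exists.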

It is further possible to save any of the results of a call
to a function by using a save phrase in the call to the function.
A save phrase, at its simplest, is the word "save" followed by a
key word (different for every function) that indicates what is to
be saved as an attribute, number, or matrix, as in the following
example:

cm independent_data as matrix(log(data), time)
cm matrix_mult(inverse(cross_products(independent_data)
   save inverse_xx) matrix_mult(transpose(independent_data)
   log(dependent_attribute)))
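
Read as a least-squares computation, the example appears to form the ordinary least-squares coefficients by the normal equations. This reading assumes that cross_products returns the matrix of sums of cross products of the columns, and that the independent data consist of log(data) and time. Under those assumptions, with y standing for dependent_attribute, the computation is:

```latex
X = \begin{bmatrix} \log(\mathrm{data}) & \mathrm{time} \end{bmatrix},
\qquad
\hat{\beta} = (X^{\top} X)^{-1} \, X^{\top} \log(y)
```

The save phrase would then keep the intermediate matrix $(X^{\top} X)^{-1}$ under the name inverse_xx for later use.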

As of the end of the year, the bulk of the coding to implement
the DATATRAN language has been finished and elementary
expressions can be run. It is expected that the entire language
system will be operational by the end of April 1974.


4.4  Economic  Application  of  Spectral  Analysis 

Principal: Robert F. Engle

Introduction 

This project began in September 1971 and has just now been
completed in 1973. The overall goals were to program facilities
for performing spectral analyses of particular
interest to economists, evaluate these procedures in the context
of economic data and problems, and explore possible empirical
applications, as only in this way can the usefulness of the
procedures be fully documented. This report will discuss work
along each of these lines.

The programs are complete and well documented.
Monte Carlo studies indicated that the small sample properties of
these estimators are better than often anticipated, and
applications of these techniques to estimation of investment and
consumption functions indicate the promise of spectral methods in
economics.

Programs

The programs are a set of subroutines which in combination
compute spectral and cross-spectral estimates, or in connection
with  a  regression  procedure,  compute  efficient  estimates  of  a 
linear  regression  with  serial  correlation  (the  Hannan  estimator), 
or  band  spectrum  regressions.  These  subroutines  are  all  called 
by  a  master  subroutine  complete  with  defaults  so  that  the  user 
need  know  nothing  about  spectral  analysis  in  order  to  obtain  the 
analysis.  Yet,  the  subroutines  are  flexible  so  that  a 
sophisticated  user  can  generate  his  own  techniques. 
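
To make the idea of a spectral estimate concrete, the following Python sketch computes a raw periodogram by direct summation and smooths it with a simple moving average. It is a generic textbook construction and not the project's subroutines; the function names, the normalization by 2*pi*n, and the rectangular smoothing window are assumptions.

```python
import cmath
import math


def periodogram(x):
    """Raw periodogram |sum_t (x_t - mean) e^(-i w_k t)|^2 / (2*pi*n)
    at the Fourier frequencies w_k = 2*pi*k/n, for k = 0..n//2."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    out = []
    for k in range(n // 2 + 1):
        w = 2 * math.pi * k / n
        s = sum(d[t] * cmath.exp(-1j * w * t) for t in range(n))
        out.append(abs(s) ** 2 / (2 * math.pi * n))
    return out


def smooth(values, half_width=2):
    """Moving-average smoothing; windows are truncated at the two ends."""
    n = len(values)
    result = []
    for k in range(n):
        lo, hi = max(0, k - half_width), min(n, k + half_width + 1)
        result.append(sum(values[lo:hi]) / (hi - lo))
    return result
```

A series with a single cycle repeating five times over the sample produces a periodogram whose largest ordinate is at k = 5; smoothing trades that sharp peak for a lower-variance estimate.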

The user interface as well as the component parts are
described in a manual to be available soon. This manual contains
a careful introduction to the interpretation and understanding of
the tools. Coupled with this is a more advanced paper, Engle (2),
describing the relation between spectral methods and time
domain  methods  (distributed  lag  and  Box-Jenkins  procedures)  as 
well  as  the  justification  for  the  procedures  used  in  the  spectral 
computations.  The  paper  is  still,  however,  a  tutorial  piece,  and 
can  be  read  by  a  user  without  any  knowledge  of  spectral  analysis. 


Testing  of  the  Estimators 


The  desirability  of  correcting  for  serial  correlation  in  a 
linear  regression,  in  order  to  obtain  efficient  estimates,  is 
widely  recognized.  However,  when  a  specific  form  for  this  serial 
correlation is not known, the principal estimator which can
guarantee asymptotic efficiency is a spectral estimator,
originally proposed by Hannan. It is rarely used, however,
partly because the programming is difficult and partly because it
presumably has far worse finite sample behavior than asymptotic
behavior.

Since  this  estimator  is  so  desirable  theoretically,  it  was 
included  in  the  spectral  package.  In  order  to  examine  the  finite 
sample  properties  of  the  estimator,  Roy  E.  Gardner  from  Cornell 
University and Robert F. Engle performed an extensive Monte
Carlo test in a variety of economically relevant environments.

The  conclusions  described  in  Engle  and  Gardner  (4)  indicate 
that  when  serial  correlation  is  of  a  relatively  simple  form,  then 
the  Hannan  estimator  is  virtually  equivalent  for  samples  of  size 
bo  or  more.  When  serial  correlation  is  more  complicated  and 
severe,  the  Hannan  estimator  has  roughly  twice  its  asymptotic 
variance  at  a  sample  of  size  100.  In  all  cases,  the  spectral 
estimators appeared to be unbiased for sample sizes over 50.
There  are  many  qualifications  to  these  results  described  in  the 
original  paper  but  the  conclusions  are  clear:  that  for 
economically  relevant  sample  sizes  (US  data  post-WWII;  quarterly 
or monthly data) these estimators are quite well-behaved.


Economic Applications


A  paper  on  band  spectrum  regression,  Engle  (1),  describes 
sensible  economic  applications  of  spectral  analysis  to  problems 
of  estimating  regressions  subject  to  seasonality,  errors  in 
variables,  or  misspecification  in  different  frequency  bands.  A 
test  statistic  is  derived  for  testing  the  hypothesis  that  the 
regression  coefficients  are  not  stable  across  frequency  bands. 
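
The idea behind band spectrum regression can be sketched in a few lines of Python: transform both series to the frequency domain, discard the frequencies believed to be contaminated, and run least squares on what remains. The sketch below handles one regressor only and uses a slow direct-summation transform; the function names and the keep predicate are assumptions, and the test statistic from the paper is not reproduced.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform by direct summation (O(n^2), fine for a sketch)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def band_spectrum_slope(x, y, keep):
    """Slope of y on x using only frequency indices k (1..n//2) for which
    keep(k) is true. Each kept k is paired with its mirror n - k."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    fx = dft([v - mx for v in x])
    fy = dft([v - my for v in y])
    num = den = 0.0
    for k in range(1, n):
        if keep(min(k, n - k)):
            num += (fx[k].conjugate() * fy[k]).real
            den += abs(fx[k]) ** 2
    return num / den
```

With keep accepting every frequency, the result equals the ordinary least-squares slope. If the regressor carries an observation error confined to one frequency, excluding that frequency removes the resulting attenuation of the slope.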

In  this  paper,  an  application  to  the  estimation  of 
consumption  functions  is  examined.  If  the  permanent  income 
hypothesis  is  valid  in  its  simplest  form,  then  the  marginal 
propensity  to  consume  out  of  transitory  income  should  be  smaller 
than out of permanent income. Identifying the high frequency
components  as  transitory  income,  we  should  reject  the  hypothesis 
that  the  regression  coefficients  are  the  same  at  different 
frequencies.  In  fact,  we  are  not  able  to  reject  this  hypothesis 
and  thus,  at  least  in  this  simple  model,  the  permanent  income 
hypothesis  receives  no  support. 

In  a  second  application  of  these  techniques,  an  investment 
function was estimated by Duncan K. Foley and R. F. Engle (3). This
function  was  derived  from  a  theory  of  macroeconomic  behavior 
based  upon  supply  limitations.  The  data,  however,  suffered  from 
errors  of  observation  which  were  assumed  to  lie  within  particular 
frequency  bands.  The  model  was  estimated  using  the  technique  of 
band spectrum regression with remarkable success, suggesting the
usefulness  of  spectral  techniques  for  macroeconomic  applications. 

Bibliography:

1. Engle, Robert F., "Band Spectrum Regression", International
Economic Review, 1974.

2. Engle, Robert F., "An Introduction to the Spectral Analysis
Capability", mimeo.

3. Engle, Robert F., and Duncan K. Foley, "An Asset Price Model
of Aggregate Investment", submitted for publication.

4. Engle, Robert F., and Roy Gardner, "Some Finite Sample
Properties of Spectral Estimators of a Linear Regression",
M.I.T. Working Paper No. 122, December 1973.


5.0  MODELING 


5.1  The  General  Implicator 
Principal:  Ithiel  de  Sola  Pool 
Contributor:  Shahriar  Ahy 

The semiannual report of January-June 1973 discussed the
purpose and structure of the General Implicator and the status at
that time. Since that report was written, a technical paper,
"Text Representation, Text-Data Management, and Text Modeling
with the General Implicator" (Ahy, Shahriar and Ithiel de Sola
Pool, The Cambridge Project, M.I.T., Sept. 1973), has been
produced in draft form and is being revised into a final draft for
publication. The paper covers, rather completely, these four
topical areas:

1. A statement on the purpose of the General Implicator
Project.

2.  An  outline  of  how  the  General  Implicator  looks  in  its  first 
implementation  in  the  Cambridge  Project  Consistent  System  on 
the  Multics  host  computer. 

3.  A  theoretical  statement  of  the  relation  between  the  General 
Implicator's representation of English statements and those
formal  representations  of  meaning  that  have  proved  useful  in 
other  artificial  intelligence  research.  This  section 
represents  our  second  thoughts,  or,  in  other  words,  the 
insights  we  have  acquired  in  doing  the  first  version  of  the 
General  Implicator. 

4.  A  brief  statement  of  next  steps  in  the  research. 


5.2  Discourse 
Principal:  Wren  McMains 

Urban  planners  and  designers  work  in  a  tentative  and 
exploratory  way.  They  describe  the  environment  into  which  they 
intend  to  intervene.  They  transform,  hypothetically  of  course, 
that  environment  because  of  certain  attitudes  they  have  developed 
toward  it  or  because  of  the  attitudes  of  others.  They  display 
the  transformed  environment  to  examine  it  and  they  often  test  the 
transformed  environment  according  to  criteria  which  they  have 
developed to measure its performance. Often after one such
cycle, the planner displays the results of his efforts to groups
with whom he is working, trying to make them understand the
alternatives he has created, and how well they perform, as well
as to discuss the characterization of the problem which his
alternatives and tests imply. Frequently the results of this
examination may result in the problem being formulated anew and
in the description being redone. In order to improve the
description, or to change it, the need arises for new sources of
data from which conclusions can be drawn.

The Discourse language has been developed for the designer
and planner as a tool which could keep pace with his tentative
and exploratory way of working. It provides the designer with
ways of describing environments as he chooses rather than in
specific types of descriptions. However, since location plays
such an important role in most environment
descriptions, it has become a descriptor which Discourse allows
to appear implicitly in other descriptions. In addition to its
locational descriptions, Discourse provides a means by which
the designer can describe his own transformations, can
specify his own displays, or can test the environment in ways
of his own choosing, rather than suggesting specific types of
transformations, rigidifying the displays, or prespecifying
environmental performance criteria. Additionally, Discourse is
designed to be convenient to use and in keeping with the
designer's thinking and conventional modes of expression.

Recent months have been devoted to documenting the
operation of the Discourse commands. Since adequate help was never
found, this has been a part-time effort. The new reference
manual is not complete, but parts of it are available; in
particular, descriptions of commands that have been added since
early 1972. Reprints of parts of the old manual that cover the
other commands are being made available in a consistent form but
remain to be updated to reflect current operation.

The Discourse-IMLAC communicator was completed. It allows a
Discourse user utilizing an IMLAC terminal to see a continually
up-to-date display of several attributes while he gives Discourse
commands just as if he were using a nongraphic console. This
prevents interrupting his train of thought just to issue
map or other display commands. The communicator also provides
graphic input of both locational and numerical information.

Several new commands were added during the period; however,
the development was funded by other sources. The most
interesting of these are:

allocate       Distributes one set of charvar (CHARacteristics
               which VARy with location) values to another set of
               locations (e.g., employment to residential
               locations)

accessibility  Allows testing accessibility to various kinds of
               services as well as activity clustering

map_values     Maps charvar values (the map command maps
               attribute locations)

categorize     Provides another way to look at charvar values. (A
               special feature also allows it to be used to
               produce color maps, either with the School of
               Architecture and Planning color scope or by
               bi-color offset printing from plates prepared on an
               IMLAC.)

separation     In addition to being a global version of the basic
               nearest command, provides for mean and maximum
               separations as well as moments

store_state    Together with "restore_state", allows for fast
               switching between various design states. In
               addition, the restore_state program journal
               provides a way to return to the state Discourse
               was in when a Multics crash occurred; however,
               since this feature results in additional storage
               costs it must be explicitly requested by the user.
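
The switching between design states described for store_state can be pictured as keeping named snapshots of the working data. The Python sketch below is an analogy only: the class, its method names, and the dictionary used to represent a design are hypothetical and say nothing about how Discourse stores its states.

```python
import copy


class DesignStates:
    """Keeps named snapshots of a working design so alternatives can be swapped."""

    def __init__(self, working):
        self.working = working
        self._saved = {}

    def store_state(self, name):
        # Deep copy so later edits to the working design leave the snapshot intact.
        self._saved[name] = copy.deepcopy(self.working)

    def restore_state(self, name):
        # Copy again so the stored snapshot can be restored more than once.
        self.working = copy.deepcopy(self._saved[name])
```

A designer could store a base state, try an alternative, store that as well, and then move between the two without recomputing either.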


For the version of Discourse that becomes a permanent part of the
Consistent System, the following tasks remain to be completed:

(1)  Update  any  old  command  descriptions  that  do  not  correctly 
reflect  present  operation. 

(2) Document internal operation, at least to the point
that  new  programmers  would  have  sufficient  information  to 
add  new  commands. 

(3) Make minor improvements in the interface between Discourse and
other  Consistent  System  programs  and  systems.  Present 
communication  is  between  Discourse  arrays  and  Consistent 
System  mnarrays.  Users  have  found  this  awkward  when  they 
want  to  have  Consistent  System  routines  operate  on 
information  stored  in  charvars,  since  they  must  first  format 
it into arrays. We now have enough experience with the
steps they most often do to make the interface as
transparent  as  possible;  this,  we  think,  is  important  if 
users  are  to  work  freely  in  the  Consistent  System. 


6.0  HUMAN  FACTORS 


6.1 A Modular Computer System for the Study of
Autonomic  Behavior 

Principal:  Craig  Fields 

This  project  has  been  described  in  detail  in  our  previous 
reports.  We  wish  to  report  that  this  project  has  merited 
funding,  separately  from  the  Cambridge  Project,  under  its  own 
contract.  It  is  rewarding  that  progress  has  been  made  in  this 
endeavor  and  we  (the  Cambridge  Project)  are  pleased  to  have  been 
involved in it.


6.2  Simulation  Gaming  as  a  Behavioral  Laboratory 
Principal:  Peter  G.  W.  Keen 

During  the  summer,  the  large-scale  management  simulation 
game,  developed  at  the  Harvard  Business  School  by  a  research  team 
which  included  P.  G.  W.  Keen,  was  documented  as  the  first  step 
towards  implementing  it  on  Multics.  Since  that  time  several 
improvements  have  been  made  in  the  game  (at  Harvard);  these 
changes  will  be  incorporated  into  the  documentation  and  it  is 
planned  to  begin  conversion  to  Multics  by  the  late  spring.  Funds 
are  being  requested  from  several  outside  sources;  the  full 
implementation  of  the  game  will  require  about  an  18-month  effort 
and  it  is  planned  to  coordinate  this  effort  at  M.I.T.  Sloan 
School,  using  technical  support  from  the  Cambridge  Project. 

When  the  game  is  operational,  it  will  be  made  available  to 
other  colleges  and  used  as  a  basis  for  comparative  experiments  on 
the  impact  of  information  structure  and  systems  on  participant 
behavior,  problem-solving  strategies,  and  the  use  of  analytic 
methods  and  models  to  support  decision-making. 

As  part  of  the  documentation  effort  during  the  summer,  this 
researcher also began several working papers describing the
game, its use, and its implications as a behavioral laboratory.

Final  drafts  of  these  papers  will  be  made  available  around 
March 1, 1974.


6.3 IMLAC Graphics

Principals:  Nicholas  Negroponte,  M.  S.  Miller 

A  complete  system  has  been  implemented  which  installs  and 
maintains  on  the  Architecture  Machine  facility  (M.I.T.  Department 
of  Architecture)  IMLAC  graphic  terminal  core  images.  Upon 
request,  these  images  may  be  transmitted  over  switched  telephone 
lines at 1200 baud to bootstrap the terminals. Components of the
system  include: 

1.  An  Architecture  Machine  program  to  read  and  file  a  paper  tape 
containing  the  "block  loader"  that  always  prefaces  an  actual 
core  image. 

2.  An  Architecture  Machine  program  to  read  and  file  a  paper  tape 
containing  the  core  image  of  an  IMLAC  program.  This  tape 
must  be  in  standard  "block  absolute"  format  with  interrecord 
gaps  consisting  of  at  least  two  null  tape  characters. 

3. An Architecture Machine program to monitor the half-duplex
telephone interface (connected to a Bell 202C modem) and
transmit the block loader followed by the core image
requested. An incoming ring is automatically answered, and
"handshaking" protocols are initiated to raise the carrier
signal so that primary data may be sent to the Architecture
Machine and supervisory data may be sent from it. A
character string identifying the core image is read with
standard provision for "character erase" and "line kill".
When the string is terminated (by receipt of the space
character), the data directions are reversed, and
transmission to the terminal begins. When it is complete,
hangup occurs and the program awaits another request. All
errors (failure to establish carrier, file not found,
disconnect, etc.) cause hangup.

4.  An  IMLAC  program  to  monitor  the  keyboard,  transmit  a  typed 
character  string  and,  upon  transmission  of  the  space 
character,  read  into  high  core  the  block  loader  and  transfer 
control  to  it.  This  program  is  implemented  in  32 
instructions  and  data  words  as  a  Read  Only  Memory  (ROM)  so 
that  a  programmer  console  is  not  necessary  to  repair  the 
program  in  case  of  accidental  damage. 

5.  An  IMLAC  program  to  read  a  core  image  and  transfer  control  to 
it  upon  completion.  This  is  the  block  loader.  As  the  core 
image  is  read  in,  checksums  are  computed  to  verify  correct 
transmission,  although  the  current  block  loader  simply  halts 
if an error is detected.

The system has been thoroughly tested with the Cambridge
Project  IMLAC  located  at  The  Architecture  Machine  facility  and 
works  flawlessly.  Successful  bootstrapping  of  another  terminal 
at  575  Technology  Square  has  also  been  accomplished.  Other 
attempts  at  Technology  Square  and  at  Harvard  have  failed  due  to 
faulty  modems,  ROMs,  or  asynchronous  interfaces  --  all  at  the 
terminal  end.  Clearly,  each  terminal  system  must  be  checked  out 
and  made  to  function  correctly. 
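
The way the Architecture Machine program reads the core image name (component 3 above) can be sketched as a small character loop in Python. The report does not say which characters serve as "character erase" and "line kill"; the "#" and "@" used here are assumptions, as is the function name.

```python
ERASE = "#"  # assumed character-erase key
KILL = "@"   # assumed line-kill key


def read_image_name(chars):
    """Consume characters until a space arrives and return the edited name.
    Returns None if the stream ends first, which the caller would treat
    as an error and answer with a hangup."""
    name = []
    for ch in chars:
        if ch == " ":
            return "".join(name)
        if ch == ERASE:
            if name:
                name.pop()  # drop the most recent character
        elif ch == KILL:
            name.clear()    # discard everything typed so far
        else:
            name.append(ch)
    return None
```

For example, the input "edix#t " yields "edit", and "junk@disc " yields "disc".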


